feat(build): migrate build support scripts #12

Merged
leaves12138 merged 5 commits into apache:main from zjw1111:migrate/build-support
May 25, 2026

@zjw1111 zjw1111 commented May 25, 2026

Purpose

Linked issue: close #xxx

Migrate the build_support tooling from the Alibaba-origin Paimon C++ repository, excluding build_support/fix_includes.py as requested.

This PR includes:

  • sanitizer suppression files
  • clang-format and clang-tidy helper scripts
  • IWYU scripts and mapping files
  • lint utilities and exclusions
  • test runner and stacktrace helper scripts
  • upstream commit helper script

Migration notes:

  • Requested files migrated: all files under build_support/ except fix_includes.py
  • Skipped requested files: none
  • Extra dependency files: none
  • License handling: preserved existing ASF headers and original third-party/source license headers for LLVM/UIUC-derived and Cloudera-derived files
  • External contributors: no external contributor threshold hits; no Co-authored-by trailer added

Tests

  • git diff --cached --check
  • bash -n build_support/get-upstream-commit.sh build_support/run-test.sh build_support/iwyu/iwyu.sh
  • perl -c build_support/stacktrace_addr2line.pl
  • python -B -m py_compile build_support/asan_symbolize.py build_support/iwyu/iwyu_tool.py build_support/lintutils.py build_support/run_clang_format.py build_support/run_clang_tidy.py
  • Migration preflight: 22 files, 1911 source lines, no missing internal dependencies
  • External contributor analysis: no threshold hits

API and Format

No API, storage format, or protocol changes.

Documentation

No user-facing documentation changes.

Generative AI tooling

Migrated-by: OpenAI Codex (GPT-5)

Copilot AI review requested due to automatic review settings May 25, 2026 06:20
Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a collection of build support scripts and configuration files (suppressions for sanitizers, IWYU mappings/wrappers, lint runners, test harness, and symbolization helpers) under build_support/.

Changes:

  • Adds sanitizer suppression files (asan, lsan, tsan, ubsan, sanitizer-disallowed-entries) and a test runner shell script integrating them.
  • Adds Python lint/format/IWYU runner scripts (run_clang_tidy.py, run_clang_format.py, iwyu_tool.py, iwyu.sh, lintutils.py) with related mapping files.
  • Adds stacktrace symbolization helpers (asan_symbolize.py, stacktrace_addr2line.pl) and a git upstream-commit helper.
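
To illustrate how a test runner can hook up suppression files like these, the sketch below builds the sanitizer runtime environment variables. It is an assumption-laden example, not a description of what run-test.sh does: `sanitizer_env` is a hypothetical helper, and the only facts relied on are the file names from this PR and the `suppressions=<path>` runtime option that the sanitizers read from `ASAN_OPTIONS`, `LSAN_OPTIONS`, `TSAN_OPTIONS`, and `UBSAN_OPTIONS`.

```python
import posixpath

# Environment variable -> suppression file name (file names from this PR).
SUPPRESSION_FILES = {
    "ASAN_OPTIONS": "asan-suppressions.txt",
    "LSAN_OPTIONS": "lsan-suppressions.txt",
    "TSAN_OPTIONS": "tsan-suppressions.txt",
    "UBSAN_OPTIONS": "ubsan-suppressions.txt",
}


def sanitizer_env(build_support_dir, base_env=None):
    """Return a copy of base_env with suppressions=<path> appended per sanitizer.

    Hypothetical helper: the real script may set more or different options.
    """
    env = dict(base_env or {})
    for name, filename in SUPPRESSION_FILES.items():
        setting = "suppressions=" + posixpath.join(build_support_dir, filename)
        existing = env.get(name)
        # Sanitizer option strings accept ':' as a separator between options.
        env[name] = existing + ":" + setting if existing else setting
    return env


if __name__ == "__main__":
    for key, value in sorted(sanitizer_env("build_support").items()):
        print(key + "=" + value)
```

The returned mapping can be merged into `os.environ` and passed as `env=` to `subprocess.run` when launching a test binary.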

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| build_support/ubsan-suppressions.txt | Empty UBSAN suppressions placeholder. |
| build_support/tsan-suppressions.txt | Empty TSAN suppressions placeholder. |
| build_support/stacktrace_addr2line.pl | Perl helper to symbolize stack-trace addresses. |
| build_support/sanitizer-disallowed-entries.txt | Sanitizer disallowed-functions list for gmock workaround. |
| build_support/run_clang_tidy.py | Driver for running clang-tidy in parallel. |
| build_support/run_clang_format.py | Driver for running clang-format / producing diffs. |
| build_support/run-test.sh | Shell wrapper to run tests with sanitizer setup, retries, core dumps. |
| build_support/lsan-suppressions.txt | LSAN suppression for libc atexit false positive. |
| build_support/lintutils.py | Shared helpers for lint scripts (chunking, parallel exec, file scanning). |
| build_support/lint_exclusions.txt | Lint exclusion globs. |
| build_support/iwyu/mappings/*.imp | IWYU mapping definitions. |
| build_support/iwyu/iwyu_tool.py | Imported IWYU driver script. |
| build_support/iwyu/iwyu.sh | Shell wrapper to run IWYU on affected/matching files. |
| build_support/iwyu/iwyu-filter.awk | AWK filter for IWYU output. |
| build_support/get-upstream-commit.sh | Returns most recent upstream-format commit hash. |
| build_support/asan_symbolize.py | Imported ASan symbolizer script. |
| build_support/asan-suppressions.txt | Empty ASan suppressions placeholder. |

Comments suppressed due to low confidence (1)

build_support/iwyu/iwyu.sh:1

  • The "all" branch leaves a commented-out pipe through iwyu-filter.awk while the other two branches still apply the filter. This is asymmetric and likely unintentional; either drop the dead comment or restore the filter for consistency.
#!/bin/bash

Comment thread build_support/run-test.sh
pipe_cmd=cat

function setup_sanitizers() {
# Sets environment variables for different sanitizers (it configures how) the run_tests. Function works.
Comment thread build_support/run-test.sh
LOGFILE=$TEST_LOGDIR/$TEST_NAME.txt
XMLFILE=$TEST_LOGDIR/$TEST_NAME.xml

TEST_EXECUTION_ATTEMPTS=1
Comment thread build_support/run-test.sh
Comment on lines +192 to +193
if [ $ATTEMPT_NUMBER -lt $TEST_EXECUTION_ATTEMPTS ]; then
# If the test fails, the test output may or may not be left behind,
Comment thread build_support/run-test.sh
Comment on lines +199 to +204
TEST_TMPDIR_BEFORE=$(find $TEST_TMPDIR -maxdepth 1 -type d | sort)
fi

if [ $ATTEMPT_NUMBER -lt $TEST_EXECUTION_ATTEMPTS ]; then
# Now delete any new test output.
TEST_TMPDIR_AFTER=$(find $TEST_TMPDIR -maxdepth 1 -type d | sort)
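
The before/after snapshot idea in the quoted lines (record the directories under the test temp dir, then delete whatever a failed attempt added) can be sketched as below. This is an illustration only, written in Python rather than shell; `list_subdirs` and `remove_new_subdirs` are hypothetical names and nothing here is taken from run-test.sh beyond the general approach.

```python
import shutil
import tempfile
from pathlib import Path


def list_subdirs(root):
    """Return the directories directly under root.

    Roughly what `find root -maxdepth 1 -type d` lists, minus root itself.
    """
    return {p for p in Path(root).iterdir() if p.is_dir()}


def remove_new_subdirs(root, before):
    """Delete directories under root that are absent from the `before` snapshot."""
    created = list_subdirs(root) - before
    for path in created:
        shutil.rmtree(path)
    return sorted(p.name for p in created)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "kept").mkdir()
        before = list_subdirs(tmp)        # snapshot taken before the attempt
        (Path(tmp) / "leftover").mkdir()  # stands in for output of a failed attempt
        print(remove_new_subdirs(tmp, before))
```

Comparing sets of `Path` objects sidesteps the word-splitting problems that unquoted `$TEST_TMPDIR` expansions can cause in shell when a path contains spaces.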
Comment thread build_support/run-test.sh
Comment on lines +62 to +63
# Remove both the uncompressed output, so the developer doesn't accidentally get confused
# and read output from a prior test run.

for line in output:
    match = CORRECT_RE.match(line)
    if match:
        print('%s:1:1: note: #includes/fwd-decls are correct', match.groups(1))
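
For context on the quoted `print` call: in Python 3, `print` does not apply `%` formatting to its arguments, so passing the format string and `match.groups(1)` as two arguments prints the literal `%s` followed by a tuple. A minimal sketch of the difference, assuming a stand-in `CORRECT_RE` pattern (the real pattern is not shown in this thread) and two hypothetical helper functions:

```python
import re

# Hypothetical stand-in for the CORRECT_RE used in the quoted snippet.
CORRECT_RE = re.compile(r"^\((.*?) has correct #includes/fwd-decls\)$")

MESSAGE = "%s:1:1: note: #includes/fwd-decls are correct"


def format_as_written(match):
    """Mimic what print(MESSAGE, match.groups(1)) emits: arguments joined by a space."""
    return " ".join(str(arg) for arg in (MESSAGE, match.groups(1)))


def format_with_substitution(match):
    """Likely intent: substitute the captured file name into the message."""
    return MESSAGE % match.group(1)


if __name__ == "__main__":
    m = CORRECT_RE.match("(foo.cc has correct #includes/fwd-decls)")
    print(format_as_written(m))
    print(format_with_substitution(m))
```

Note also that `match.groups(1)` returns a tuple of all groups (with `1` as the default for groups that did not participate), whereas `match.group(1)` returns the first captured string.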

def __init__(self, filename):
    super(BreakpadSymbolizer, self).__init__()
    self.filename = filename
    lines = file(filename).readlines()
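
For context on the quoted `file(filename)` call: `file()` is a Python 2 builtin that no longer exists in Python 3, where the call raises `NameError`; `open()` works in both. A minimal sketch with a hypothetical `read_lines` helper, not the actual symbolizer code:

```python
import os
import tempfile


def read_lines(filename):
    """Read all lines with open(); the context manager closes the handle."""
    with open(filename) as handle:
        return handle.readlines()


if __name__ == "__main__":
    # Write a small throwaway file to demonstrate the helper.
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as out:
            out.write("MODULE example\nFUNC 0 10 0 main\n")
        print(read_lines(path))
    finally:
        os.remove(path)
```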
Comment thread build_support/run-test.sh
Comment on lines +2 to +15
# Copyright 2014 Cloudera, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
Comment on lines +113 to +118
for line in _remove_color(stdout).splitlines():
    for filename in bfilenames:
        if line.startswith(filename):
            problem_files.add(filename)
            bfilenames.remove(filename)
            break
Comment on lines +136 to +137
pool.join()
sys.exit(1 if error else 0)

@leaves12138 leaves12138 left a comment

Thanks for migrating the build support scripts. I found a licensing blocker that needs to be resolved before merge.

The PR adds files with non-ASF third-party license/copyright notices, for example build_support/asan_symbolize.py and build_support/iwyu/iwyu_tool.py carry the LLVM/UIUC license text, while build_support/run-test.sh and build_support/stacktrace_addr2line.pl carry Cloudera copyright notices. The current LICENSE update only lists build_support/ under the Apache Arrow section, so these additional third-party notices are not represented accurately.

Please update LICENSE/NOTICE to account for these third-party files and their original licenses/copyrights, or remove/replace those files if they are not needed.

@leaves12138 leaves12138 left a comment

Re-reviewed the latest update. The previous licensing blocker is addressed: LICENSE now identifies the Apache Arrow-derived build support files, the LLVM/compiler-rt UIUC-licensed symbolizer, the IWYU UIUC-licensed tool, and the Apache Kudu/Cloudera Apache-licensed utilities. I do not see further blockers.
